83 research outputs found

    Are we there yet? Reliably estimating the completeness of plant genome sequences

    Genome sequencing is becoming cheaper and faster thanks to the introduction of next-generation sequencing techniques. Dozens of new plant genome sequences have been released in recent years, ranging from small to gigantic repeat-rich or polyploid genomes. Most genome projects have a dual purpose: delivering a contiguous, complete genome assembly and creating a full catalog of correctly predicted genes. Frequently, the completeness of a species' gene catalog is measured using a set of marker genes that are expected to be present. This expectation can be defined along an evolutionary gradient, ranging from highly conserved genes to species-specific genes. Large-scale population resequencing studies have revealed that gene space is fairly variable even between closely related individuals, which limits the definition of the expected gene space, and, consequently, the accuracy of estimates used to assess genome and gene space completeness. We argue that, based on the desired applications of a genome sequencing project, different completeness scores for the genome assembly and/or gene space should be determined. Using examples from several dicot and monocot genomes, we outline some pitfalls and recommendations regarding methods to estimate completeness during different steps of genome assembly and annotation

    A molecular timetable for apical bud formation and dormancy induction in poplar

    The growth of perennial plants in the temperate zone alternates with periods of dormancy that are typically initiated during bud development in autumn. In a systems biology approach to unravel the underlying molecular program of apical bud development in poplar (Populus tremula × Populus alba), combined transcript and metabolite profiling were applied to a high-resolution time course from short-day induction to complete dormancy. Metabolite and gene expression dynamics were used to reconstruct the temporal sequence of events during bud development. Importantly, bud development could be dissected into bud formation, acclimation to dehydration and cold, and dormancy. To each of these processes, specific sets of regulatory and marker genes and metabolites are associated and provide a reference frame for future functional studies. Light, ethylene, and abscisic acid signal transduction pathways consecutively control bud development by setting, modifying, or terminating these processes. Ethylene signal transduction is positioned temporally between light and abscisic acid signals and is putatively activated by transiently low hexose pools. The timing and place of cell proliferation arrest (related to dormancy) and of the accumulation of storage compounds (related to acclimation processes) were established within the bud by electron microscopy. Finally, the identification of a large set of genes commonly expressed during the growth-to-dormancy transitions in poplar apical buds, cambium, or Arabidopsis thaliana seeds suggests parallels in the underlying molecular mechanisms in different plant organs

    Population genomic structure of the gelatinous zooplankton species Mnemiopsis leidyi in its nonindigenous range in the North Sea

    Nonindigenous species pose a major threat to coastal and estuarine ecosystems. Risk management requires genetic information to establish appropriate management units and infer introduction and dispersal routes. We investigated one of the most successful marine invaders, the ctenophore Mnemiopsis leidyi, and used genotyping-by-sequencing (GBS) to explore the spatial population structure in its nonindigenous range in the North Sea. We analyzed 140 specimens collected in different environments, including coastal and estuarine areas, and ports along the coast. Single nucleotide polymorphisms (SNPs) were called in approximately 40,000 GBS loci. Population structure based on the neutral SNP panel was significant (F_ST = 0.02; p < 0.01), and a distinct genetic cluster was identified in a port along the Belgian coast (Ostend port; pairwise F_ST = 0.02-0.04; p < 0.01). Remarkably, no population structure was detected between geographically distant regions in the North Sea (the southern part of the North Sea vs. the Kattegat/Skagerrak region), which indicates substantial gene flow at this geographical scale and recent population expansion of nonindigenous M. leidyi. Additionally, seven specimens collected at one location in the indigenous range (Chesapeake Bay, USA) were highly differentiated from the North Sea populations (pairwise F_ST = 0.36-0.39; p < 0.01). This study demonstrates the utility of GBS to investigate fine-scale population structure of gelatinous zooplankton species and shows high population connectivity among nonindigenous populations of this recently introduced species in the North Sea. Open Research Badges: This article has earned an Open Data Badge for making publicly available the digitally shareable data necessary to reproduce the reported results.
The DNA sequences generated for this study are deposited in the NCBI Sequence Read Archive under SRA accession numbers -, and will be made publicly available upon publication of this manuscript

    Whole-genome deep sequencing reveals host-driven in-planta evolution of Columnea Latent Viroid (CLVd) quasi-species populations

    Columnea latent viroid (CLVd) is one of the most serious pathogens of tomato. In general, viroids have high mutation rates. This generates a population of variants (so-called quasi-species) that co-exist in their host and exhibit a high level of genetic diversity. To study the population of CLVd in individual host plants, we used amplicon sequencing with CLVd-specific primers linked to a sample-specific index sequence to amplify libraries. An infectious clone of the CLVd isolate Chaipayon-1 was inoculated on different solanaceous host plants. Six replicates of the amplicon sequencing results showed very high reproducibility. On average, we obtained 133,449 CLVd reads per PCR replicate and 79 to 561 viroid sequence variants, depending on the plant species. We identified 19 major variants (>1.0% mean relative abundance) in which a total of 16 single-nucleotide polymorphisms (SNPs) and two single-nucleotide insertions were observed. All major variants contained a combination of 4 to 6 SNPs. Secondary structure prediction clustered all major variants into a tomato/bolo maka group with four loops (I, II, IV and V), and a chili pepper group with four loops (I, III, IV and V) at the terminal right domain, compared to the CLVd Chaipayon-1 which consists of five loops (I, II, III, IV and V)

    Identification and assessment of variable single-copy orthologous (SCO) nuclear loci for low-level phylogenomics: a case study in the genus Rosa (Rosaceae)

    Background: With an ever-growing number of published genomes, many low levels of the Tree of Life now contain several species with enough molecular data to perform shallow-scale phylogenomic studies. Moving away from using just a few universal phylogenetic markers, we can now target thousands of other loci to decipher taxa relationships. Making the best possible selection of informative sequences regarding the taxa studied has emerged as a new issue. Here, we developed a general procedure to mine genomic data, looking for orthologous single-copy loci capable of deciphering phylogenetic relationships below the generic rank. To develop our strategy, we chose the genus Rosa, a rapidly evolving lineage of the Rosaceae family in which the genomes of several species have recently been sequenced. We also compared our loci to conventional plastid markers, commonly used for phylogenetic inference in this genus

    Establishment of CRISPR/Cas9 genome editing in witloof (Cichorium intybus var. foliosum)

    Cichorium intybus var. foliosum (witloof) is an economically important crop with a high nutritional value thanks to many specialized metabolites, such as polyphenols and terpenoids. However, witloof plants are rich in sesquiterpene lactones (SL), which are important for plant defense but also impart a bitter taste, thus limiting industrial applications. Inactivating specific genes in the SL biosynthesis pathway could lead to changes in the SL metabolite content and result in altered bitterness. In this study, a CRISPR/Cas9 genome editing workflow was implemented for witloof, starting with polyethylene glycol (PEG)-mediated protoplast transfection for CRISPR/Cas9 vector delivery, followed by whole plant regeneration and mutation analysis. Protoplast transfection efficiencies ranged from 20 to 26%. A CRISPR/Cas9 vector targeting the first exon of the phytoene desaturase (CiPDS) gene was transfected into witloof protoplasts and resulted in the knockout of CiPDS, giving rise to an albino phenotype in 23% of the regenerated plants. Further implementing our protocol, the SL biosynthesis pathway genes germacrene A synthase (GAS), germacrene A oxidase (GAO), and costunolide synthase (COS) were targeted in independent experiments. Highly multiplex (HiPlex) amplicon sequencing of the genomic target loci revealed plant mutation frequencies of 27.3, 42.7, and 98.3% in regenerated plants transfected with a CRISPR/Cas9 vector targeting CiGAS, CiGAO, and CiCOS, respectively. We observed different mutation spectra across the loci, ranging from the same +1 nucleotide insertion in CiCOS recurring consistently across independent mutated lines, to a complex set of 20 mutation types in CiGAO across independent mutated lines. These results demonstrate a straightforward workflow for genome editing based on transfection and regeneration of witloof protoplasts and subsequent HiPlex amplicon sequencing. 
Our CRISPR/Cas9 workflow can enable gene functional research and faster incorporation of novel traits in elite witloof lines in the future, thus facilitating the development of novel industrial applications for witloof

    Utilization of Tissue Ploidy Level Variation in de Novo Transcriptome Assembly of Pinus sylvestris

    Compared to angiosperms, gymnosperms lag behind in the availability of assembled and annotated genomes. Most genomic analyses in gymnosperms, especially conifer tree species, rely on the use of de novo assembled transcriptomes. However, the level of allelic redundancy and transcript fragmentation in these assembled transcriptomes, and their effect on downstream applications, have not been fully investigated. Here, we assessed three assembly strategies for short-read data, including the utility of haploid megagametophyte tissue during de novo assembly as single-allele guides, for six individuals and five different tissues in Pinus sylvestris. We then contrasted haploid and diploid tissue genotype calls obtained from the assembled transcriptomes to evaluate the extent of paralog mapping. The use of the haploid tissue during assembly increased its completeness without reducing the number of assembled transcripts. Our results suggest that current strategies that rely on available genomic resources as guidance to minimize allelic redundancy are less effective than the application of strategies that cluster redundant assembled transcripts. The strategy yielding the lowest levels of allelic redundancy among the assembled transcriptomes assessed here was the generation of SuperTranscripts with Lace followed by CD-HIT clustering. However, we still observed some levels of heterozygosity (multiple gene fragments per transcript reflecting allelic redundancy) in this assembled transcriptome on the haploid tissue, indicating that further filtering is required before using these assemblies for downstream applications. We discuss the influence of allelic redundancy when these reference transcriptomes are used to select regions for probe design of exome capture baits and for estimation of population genetic diversity.

    Cow responses and evolution of the rumen bacterial and methanogen community following a complete rumen content transfer

    Understanding the rumen microbial ecosystem requires the identification of factors that influence the community structure, such as nutrition, physiological condition of the host and host-microbiome interactions. The objective of the current study was to describe the rumen microbial communities before, during and after a complete rumen content transfer. The rumen contents of one donor cow were removed completely and used as inoculum for the emptied rumen of the donor itself and three acceptor cows under identical physiological and nutritional conditions. Temporal changes in microbiome composition and rumen function were analysed for each of four cows over a period of 6 weeks. Shortly after transfer, the cows showed different responses to perturbation of their rumen content. Feed intake depression in the first 2 weeks after transfer resulted in short-term changes in milk production, methane emission, fatty acid composition and rumen bacterial community composition. These effects were more pronounced in two cows, whose microbiome composition showed reduced diversity. The fermentation metrics and microbiome diversity of the other two cows were not affected. Their rumen bacterial community initially resembled the composition of the donor but evolved to a new community profile that resembled neither the donor nor their original composition. Descriptive data presented in the current paper show that the rumen bacterial community composition can quickly recover from a reduction in microbiome diversity after a severe perturbation. In contrast to the bacteria, methanogenic communities were more stable over time and unaffected by stress or host effects

    Orthology guided transcriptome assembly of Italian ryegrass and meadow fescue for single-nucleotide polymorphism discovery

    Single-nucleotide polymorphisms (SNPs) represent natural DNA sequence variation. They can be used for various applications including the construction of high-density genetic maps, analysis of genetic variability, genome-wide association studies, and map-based cloning. Here we report on transcriptome sequencing in the two forage grasses, meadow fescue (Festuca pratensis Huds.) and Italian ryegrass (Lolium multiflorum Lam.), and identification of various classes of SNPs. Using the Orthology Guided Assembly (OGA) strategy, we assembled and annotated a total of 18,952 and 19,036 transcripts for Italian ryegrass and meadow fescue, respectively. In addition, we used transcriptome sequence data of perennial ryegrass (L. perenne L.) from a previous study to identify 16,613 transcripts shared across all three species. Large numbers of intraspecific SNPs were identified in all three species: 248,000 in meadow fescue, 715,000 in Italian ryegrass, and 529,000 in perennial ryegrass. Moreover, we identified almost 25,000 interspecific SNPs located in 5,343 genes that can distinguish meadow fescue from Italian ryegrass and 15,000 SNPs located in 3,976 genes that discriminate meadow fescue from both Lolium species. All identified SNPs were positioned in silico on the seven linkage groups (LGs) of L. perenne using the GenomeZipper approach. With the identification and positioning of interspecific SNPs, our study provides a valuable resource for the grass research and breeding community and will enable detailed characterization of genomic composition and gene expression analysis in prospective Festuca × Lolium hybrids

    Industrial chicory genome gives insights into the molecular timetable of anther development and male sterility

    Industrial chicory (Cichorium intybus var. sativum) is a biannual crop mostly cultivated for extraction of inulin, a fructose polymer used as a dietary fiber. F1 hybrid breeding is a promising breeding strategy in chicory but relies on stable male sterile lines to prevent self-pollination. Here, we report the assembly and annotation of a new industrial chicory reference genome. Additionally, we performed RNA-Seq on subsequent stages of flower bud development of a fertile line and two cytoplasmic male sterile (CMS) clones. Comparison of fertile and CMS flower bud transcriptomes combined with morphological microscopic analysis of anthers, provided a molecular understanding of anther development and identified key genes in a range of underlying processes, including tapetum development, sink establishment, pollen wall development and anther dehiscence. We also described the role of phytohormones in the regulation of these processes under normal fertile flower bud development. In parallel, we evaluated which processes are disturbed in CMS clones and could contribute to the male sterile phenotype. Taken together, this study provides a state-of-the-art industrial chicory reference genome, an annotated and curated candidate gene set related to anther development and male sterility as well as a detailed molecular timetable of flower bud development in fertile and CMS lines